Journal of the Association for Research in Otolaryngology
Springer Science and Business Media LLC
Preprints posted in the last 90 days, ranked by how well they match Journal of the Association for Research in Otolaryngology's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Sivaprakasam, A.; Schweinzger, I.; Heinz, M.
Aging and noise over-exposure lead to complex mixtures of cochlear degradation that impair the structure and function of outer hair cells, inner hair cells (IHCs), and the cochlear nerve. However, IHC damage and cochlear synaptopathy (CS) remain pathologies "hidden" from the audiogram. This study aimed to identify and differentiate the physiological signatures of these two distinct pathologies using promising non-invasive assays: Envelope Following Responses (EFRs), Auditory Brainstem Responses (ABRs), Wideband Middle Ear Muscle Reflexes (WB-MEMRs), and Distortion Product Otoacoustic Emissions (DPOAEs). We utilized chinchilla models of carboplatin-induced (CA) IHC damage (N = 4) and temporary threshold shift (TTS) noise-induced CS (N = 4) to compare the physiological signatures of each pathology. While both groups showed unchanged ABR thresholds two weeks after exposure, EFRs, ABR Wave V/I ratios, and MEMRs showed distinct effects of exposure. Despite non-elevated ABR-derived audiometric thresholds after exposure, both CA and TTS exposure resulted in severe deficits in EFR "peakiness", particularly for sharp, short-duty-cycle stimuli, and in significant elevations in ABR Wave V/I ratios. However, these findings were less pronounced in the TTS-exposed animals. WB-MEMR amplitudes were decreased with elevated thresholds in both groups; this effect was more pronounced in the TTS group. Opposite trends in DPOAE amplitudes indicated that while both IHC damage and CS result in similar suprathreshold temporal coding deficits, effects on outer-hair-cell integrity and auditory efferent physiology may differ between the two pathologies. Future work and novel diagnostics should aim to distinguish these specific cochlear pathologies in clinical populations, or at the very least consider their overlap.
Highlights:
- A multi-metric diagnostic approach was used with chinchilla models of inner-hair-cell (IHC) damage and cochlear synaptopathy (CS).
- IHC damage and synaptopathy both cause suprathreshold deficits "hidden" from the audiogram.
- IHC damage results in more severe temporal envelope coding degradation than does synaptopathy.
- A combination of EFR "peakiness", ABR Wave V/I ratio, and Wideband Middle Ear Muscle Reflex (WB-MEMR) appears to be a useful set of measures for profiling IHC damage and CS.
Devolder, P.; Deloche, F.; Thienpont, M.; Keppler, H.; Verhulst, S.
The middle ear muscle reflex (MEMR) and medial olivocochlear reflex (MOCR) are increasingly studied for their role in suprathreshold auditory processing. However, recording these reflexes in humans is potentially complicated by age-related (sub)clinical hearing loss and co-activation. This study investigates (1) the influence of age-related (sub)clinical hearing loss, (2) methodological differences between conventional and wideband MEMR techniques, and (3) how MEMR activation contaminates MOCR recordings. Three test groups were included: young normal-hearing adults, middle-aged normal-hearing adults, and middle-aged adults with audiometric hearing loss. Cochlear status and neural encoding were assessed using distortion-product otoacoustic emissions (DPOAEs) and envelope following responses (EFRs). MEMR recordings were compared using conventional tonal stimuli and wideband stimuli. MOCR was recorded at elicitor levels of 60 and 75 dB to evaluate MEMR co-activation. MEMR was related to age, suggesting sensitivity to subclinical cochlear damage. Wideband stimuli were beneficial as the elicitor (noise vs. tone), while changing the probe stimulus added no significant benefit (click vs. tone). MOCR strength did not correlate with age-related subclinical hearing loss, suggesting that MOCR measurements may reflect efferent function relatively independently of afferent sensorineural status in audiometrically normal-hearing participants. However, reliable recordings were challenging in participants with audiometric hearing loss due to poor OAE baselines. MEMR co-activation was detectable in the click response and could alter MOCR-induced suppression. These findings suggest that, in cases of normal hearing thresholds, MEMR amplitude may be a marker of subclinical cochlear damage and MOCR measurements may more specifically reflect efferent function. Clinical measurements can be improved by using broadband stimuli, accounting for outer-hair-cell damage, and defining criteria for reflex co-activation.
Sotero Silva, N.; Kayser, C.
Recent studies describe Eye Movement-related Eardrum Oscillations (EMREOs), low-frequency signals recorded in the ear canal that arise from the tympanic membrane and are triggered by saccadic eye movements. Because EMREOs are thought to arise from motor elements in the peripheral auditory system, we examined how two known modulators of these elements affect the EMREO time course. First, the activity of outer hair cells (OHCs) can be suppressed by the medial olivocochlear reflex (MOCR). If OHCs contribute to the generation of EMREOs, activation of this reflex should reduce EMREO amplitude. To test this, we compared EMREO amplitudes elicited by saccades performed in silence and in the presence of contralateral noise. Second, gravitational cues linked to head orientation may influence EMREOs via oculomotor control circuits that possibly modulate middle ear muscles. To test this, we recorded EMREOs while participants made saccades with their head upright (0° azimuth) and with their head tilted 30° in either direction. Across both experiments our data reveal no clear modulation of the EMREO time course by these experimental manipulations. Together with other recent studies, these findings advocate for a stability of the EMREO time course towards multiple experimental modulations and fuel speculations that the signal may serve as a temporal reference frame when combining signals across the senses.
Kamau, A. F.; Merchant, G. R.; Nakajima, H. H.; Neely, S. T.
Conductive hearing loss (CHL) with a normal otoscopic exam can be difficult to diagnose because routine clinical measures such as audiometric air-bone gaps (ABGs) can identify a conductive component but often cannot distinguish among specific underlying mechanical pathologies (e.g., stapes fixation versus superior canal dehiscence, which may produce similar audiograms). Wideband tympanometry (WBT) is a fast, noninvasive test that can provide additional mechanical information across a broad range of frequencies (200 Hz to 8 kHz). However, WBT metrics are influenced by variations in ear canal geometry and probe placement and can be challenging to interpret clinically. In this study, we extend prior WBT absorbance-based classification work by estimating the middle ear input impedance at the tympanic membrane (ZME), a WBT-derived metric intended to reduce ear canal effects. To estimate ZME, we fit an analog circuit model of the ear canal, middle ear, and inner ear to raw WBT data collected at tympanometric peak pressure (TPP). Data from 27 normal ears, 32 ears with superior canal dehiscence, and 38 ears with stapes fixation were analyzed. A multinomial logistic regression classifier was trained using principal component analysis (retaining 90% variance) and stratified 5-fold cross-validation with regularization. We compared feature sets based on ABGs alone, ABGs combined with absorbance, and ABGs combined with the magnitude of ZME. The combination of ABGs and the magnitude of ZME produced the best performance, achieving an overall accuracy of 85.6% compared to 80.4% for ABGs alone and 78.4% for ABGs combined with absorbance. These results suggest that incorporating model-derived middle ear impedance features with standard audiometric measures (ABGs) can improve automated pathology classification for stapes fixation and superior canal dehiscence.
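As a rough arithmetic check on the accuracies quoted above, the three groups total 97 ears (27 + 32 + 38). The sketch below assumes, and this is not stated in the abstract, that each overall accuracy is the fraction of those 97 ears classified correctly when pooled across folds. The correct-ear counts it prints are back-calculated for illustration only and are not reported results.

```python
# Group sizes as reported in the abstract.
ears = {"normal": 27, "superior canal dehiscence": 32, "stapes fixation": 38}
total = sum(ears.values())

# Overall accuracies (percent) as reported in the abstract.
reported = {"ABG only": 80.4, "ABG + absorbance": 78.4, "ABG + |ZME|": 85.6}


def implied_correct(accuracy_pct, n):
    """Nearest whole number of correctly classified ears for a given accuracy.

    Assumption: accuracy is pooled over all n ears (hypothetical reading).
    """
    return round(accuracy_pct / 100 * n)


implied = {name: implied_correct(acc, total) for name, acc in reported.items()}

for name, k in implied.items():
    print(f"{name}: about {k}/{total} correct = {100 * k / total:.1f}%")
```

Under that assumption, the gain from adding the magnitude of ZME to ABGs corresponds to a handful of additional correctly classified ears, which is worth keeping in mind given the sample size.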
Augsten, M.-L.; Lindenbeck, M. J.; Laback, B.
Cochlear implant (CI) users typically experience difficulties perceiving musical harmony due to a restricted spectro-temporal resolution at the electrode-nerve interface, resulting in limited pitch perception. We investigated how stimulus parameters affect discrimination of complex-tone triads (three-voice chords), aiming to identify conditions that maximize perceptual sensitivity. Six post-lingually deafened CI listeners completed a same/different task with harmonic complex tones, while spectral complexity, voice(s) containing a pitch change, and temporal synchrony (simultaneous vs. sequential triad presentation) were manipulated. CI listeners discriminated harmonically relevant one-semitone pitch changes within triads when spectral complexity was reduced to three or five components per voice, with significantly better performance for three-component compared to nine-component tones. Sensitivity was observed for pitch changes in the high voice or in both high and low voices, but not for changes in only the low voice. Single-voice sensitivity predicted simultaneous-triad sensitivity when controlling for spectral complexity and voice with pitch change. Contrary to expectations, sequential triad presentation did not improve discrimination. An analysis of processor pulse patterns suggests that difference-frequency cues encoded in the temporal envelope rather than place-of-excitation cues underlie perceptual triad sensitivity. These findings support reducing spectral complexity to enhance chord discrimination for CI users based on temporal cues.
Neely, S. T.; Harris, S. E.; Hajicek, J. J.; Petersen, E. A.; Shen, Y.
In a loudness-matching paradigm, a reduction in the loudness of sounds with bandwidths less than one-half octave compared to a tone of equal sound pressure level has been observed previously for five-tone complexes at 60 dB SPL centered at 1 kHz. Here, this loudness-reduction phenomenon is explored using band-limited noise across wide ranges of frequency and level. Additionally, these measurements are simulated by a model of loudness judgement based on neural ensemble averaging (NEA), which serves as a proxy for central auditory signal processing. Multi-frequency equal-loudness contours (ELC) were measured for each of the adult participants (N=100) with pure-tone average (PTA) thresholds that ranged from normal to moderate hearing loss using a categorical-loudness-scaling (CLS) paradigm. Presentation level and center frequency of the test stimuli were determined on each trial according to a Bayesian adaptive algorithm, which enabled multi-frequency ELC estimation within about five minutes of testing. Three separate test conditions differed by stimulus type: (1) pure-tone, (2) quarter-octave noise and (3) octave noise. For comparison, loudness judgements for all three stimulus types were also simulated by the NEA model, which comprised a nonlinear, active, time-domain cochlear model with an appended stage of neural spike generation. Mid-bandwidth loudness reduction was observed to be greatest at moderate stimulus levels and frequencies near 1 kHz. This feature was approximated by the NEA model, which suggests involvement of an early stage of the central auditory system in the formation of loudness judgements.
Caro, A. M.; Zhang, Z.; Gansemer, B. M.; Green, S. H.
Spiral ganglion neurons (SGNs) constitute the sole afferent connection between cochlear hair cells and central auditory nuclei. SGNs die during postnatal developmental pruning, and also following hair cell death, which can be triggered by ototoxic agents such as aminoglycoside antibiotics, including kanamycin. After hair cell loss, animal models show extensive SGN degeneration occurring gradually over a period of weeks to months. Here, we compared spatial and temporal patterns of SGN loss and immune cell involvement in these two cases of cell death in rats. Developmental SGN pruning occurred from postnatal day 5 (P5) to P8 in the basal half of the cochlea, and from P5 to P12 in the apical half. This was accompanied by a transient increase in spiral ganglion macrophages temporally and spatially correlated with SGN death, consistent with a role clearing degenerating neurons. After deafening neonatal rats with kanamycin injections, SGN death became evident at approximately 5.5 weeks of age and persisted throughout the ganglion, with greatest loss in the middle regions; less in the base and apex. Macrophage numbers also increased but neither temporally nor spatially correlated with SGN death. Rather, increased macrophage number and activation began approximately three weeks before SGN death and was highest in the apex. Additionally, T-cells and NK cells appeared in the ganglion concurrently with SGN degeneration. These observations suggest fundamentally different roles for macrophages post-deafening than during developmental pruning and, with prior observations that anti-inflammatory drugs reduce SGN death, support a causal role for immune responses in SGN death post-deafening.
Lien, J. T.-H.; Strahl, S.; Garcia, C.; Vickers, D.
The human auditory system decomposes complex sounds into distinct components via a collection of processing steps. Knowing whether Spiral Ganglion Cells (SGCs) play an active role in the decoding of complex sounds can facilitate the development of Cochlear Implant (CI) coding strategies and clinical assessment tools. Early animal studies reported SGCs being similar across different characteristic frequencies (CFs). In this study, human electrically evoked compound action potentials (eCAPs) were analysed to probe the relationship between the reciprocal of CF and the duration of the eCAP. A significant relationship could indicate that SGCs may not simply be passive cables. eCAP datasets from 6 published studies (175 CI users, 1243 recordings) were analysed and their peaks were automatically labelled. The n1p2 latency was derived for each recording as a proxy of the action potential duration. The CF of each recording was estimated by mapping the average insertion angle of the electrode to the human SGC map. A weak but statistically significant relationship was observed between the n1p2 latency and the reciprocal of CF (random-effects model with random intercepts for subject, r = 0.09, p = 0.024, n = 450), supporting the hypothesis that lower CF is associated with slower repolarisation (longer n1p2 latency) in human spiral ganglion cells.
King, C. D.; Zhu, T.; Groh, J. M.
Information about eye movements is necessary for linking auditory and visual information across space. Recent work has suggested that such signals are incorporated into processing at the level of the ear itself (Gruters, Murphy et al. 2018). Here we report confirmation that the eye movement signals that reach the ear can produce perceptual consequences, via a case report of an unusual participant with tensor tympani myoclonus who hears sounds when she moves her eyes. The sounds she hears could be recorded with a microphone in the ear in which she hears them (left), and occurred for large leftward eye movements to extreme orbital positions of the eyes. The sounds elicited by this participant's eye movements were reminiscent of eye movement-related eardrum oscillations (EMREOs; Gruters, Murphy et al. 2018, Brohl and Kayser 2023, King, Lovich et al. 2023, Lovich, King et al. 2023, Lovich, King et al. 2023, Abbasi, King et al. 2025, Sotero Silva, Kayser et al. 2025, King and Groh 2026, Leon, Ramos et al. 2026, Sotero Silva, Brohl et al. 2026), but were larger and longer lasting than classical EMREOs, helping to explain why they were audible to her. Overall, the observations from this patient help establish that (a) eye movement-related signals specifically reach the tensor tympani muscle and that (b) when there is an abnormality involving that muscle, such signals can lead to actual audible percepts. Given that the tensor tympani contributes to the regulation of sound transmission in the middle ear, these findings support the view that eye movement signals reaching the ear have functional consequences for auditory perception. The findings also expand the types of medical conditions that produce gaze-evoked tinnitus, to date most commonly observed in connection with acoustic neuromas.
Borrajo, M.; Callejo, A.; Castellanos, E.; Amilibia, E.; Llorens, J.
Vestibular schwannomas (VS) cause vestibular function loss by mechanisms that are still poorly understood. We evaluated the vestibulo-ocular reflex by the video-assisted Head Impulse Test (vHIT) in patients with planned tumour resection by a trans-labyrinthine approach. The vestibular sensory epithelia were collected and processed by immunofluorescent labelling for confocal microscopy analysis of sensory hair cell subtypes (type I, HCI, and type II, HCII), calyx endings of the pure-calyx afferents, and the calyceal junction normally found between HCI and the calyx (n=23). Comparing Normofunction and Hypofunction patients, we concluded that worse vestibular function is associated with decreased HCI and HCII counts in the sensory epithelia and with an increased proportion of damaged calyces. A decrease in the number of HCI and calyx endings of the pure-calyx afferents was associated with increasing age. Partial least squares regression (PLSR) models indicated that VS and age had independent, additive effects on vestibular function. Correlation analyses indicated that lower vHIT gains are associated with lower numbers of HCI and increased percentages of damaged calyces. These data support the hypothesis that the deleterious effect of VS on vestibular function is mediated, at least in part, by its damaging impact on the vestibular sensory epithelium. They also provide further evidence for the dependency of the vestibulo-ocular reflex on HCI function and for the calyceal junction pathology as a common response of the sensory epithelium to HC stress.
Palou, A.; Tagliabue, M.; Beraneck, M.; Llorens, J.
The rat vestibular system plays a critical role in anti-gravity responses such as the tail-lift reflex and the air-righting reflex. In a previous study in male rats, we obtained evidence that these two reflexes depend on the function of non-identical populations of vestibular sensory hair cells (HC). Here, we caused graded lesions in the vestibular system of female rats by exposing the animals to several different doses of an ototoxic chemical, 3,3'-iminodipropionitrile (IDPN). After exposure, we assessed the anti-gravity responses of the rats and then assessed the loss of type I HC (HCI) and type II HC (HCII) in the central and peripheral regions of the crista, utricle and saccule. As expected, we recorded a dose-dependent loss of vestibular function and loss of HCs. The relationship between hair cell loss and functional loss was examined using non-linear models fitted by orthogonal distance regression. The results indicated that both the tail-lift reflex and the air-righting reflex mostly depend on HCI function. However, the two reflexes differed in which epithelium triggers them: while the tail-lift response is sensitive to loss of crista and/or utricle HCIs, the air-righting response instead depends on utricular and/or saccular integrity.
Motlagh Zadeh, L.; Izhiman, D.; Blankenship, C. M.; Moore, D. R.; Martin, D. K.; Garinis, A.; Feeney, P.; Hunter, L. R.
Objectives: Patients with cystic fibrosis (CF) often receive aminoglycosides (AGs) to manage recurrent pulmonary infections, placing them at risk for ototoxicity. Chronic AG use can lead to complex cochlear damage affecting inner and outer hair cells, the stria vascularis, and spiral ganglion neurons. The greatest damage is typically in the basal cochlear region, which encodes high-frequency hearing, with additional involvement of more apical regions. While extended-high-frequency (EHF) hearing loss (EHFHL; 9-16 kHz) is often the earliest sign of AG ototoxicity, speech in noise (SiN) effects are rarely studied. Our overall hypothesis is that SiN perception difficulties in individuals with CF treated with AGs are related to combined cochlear and neural damage, primarily in the EHF range but also in the standard frequency (SF; 0.25-8 kHz) range. Three mechanisms that contribute to SiN perception were evaluated in children and young adults: 1) a primary effect of reduced EHF sensitivity, measured by pure-tone audiometry (PTA) and transient-evoked otoacoustic emissions (TEOAEs); 2) a secondary effect of subclinical damage in the SF range, measured by PTA and TEOAEs; and 3) additional neural effects, measured by middle ear muscle reflex (MEMR) threshold (afferent) and growth functions (efferent). Design: A total of 185 participants were enrolled: 101 individuals with CF treated with intravenous AGs and 84 age- and sex-matched controls without hearing concerns or CF. Assessments included EHF and SF PTA; the Bamford-Kowal-Bench (BKB)-SIN test for SiN perception; double-evoked TEOAEs with chirp stimuli from 0.71 to 14.7 kHz; and ipsilateral and contralateral wideband MEMR thresholds and growth functions using broadband stimuli. Results: Reduced sensitivity at EHFs (PTA, TEOAEs) was not associated with impaired SiN perception in the CF group. SF hearing, regardless of EHF status, was the primary predictor of SiN performance in the CF group.
Increased MEMR growth was also significantly associated with poorer SiN in the CF group. Conclusions: In CF, impaired SiN perception was primarily predicted by SF hearing impairment, with additional involvement of the efferent auditory pathway through increased MEMR growth. These results build on prior evidence for efferent neural effects due to ototoxic exposures, supporting both sensory (afferent) and neural (efferent) mechanisms that contribute to listening difficulties in CF. Thus, preventive and intervention strategies should consider these combined mechanisms in people with AG ototoxicity to address their SiN problems.
Hajicek, J.; Harris, S. E.; Neely, S. T.
Purpose: This research sought to develop a low-cognitive-load speech-in-noise test based on consonant confusions with the potential for assessing hearing-aid benefit. Methods: Vowel-consonant-vowel (VCV) stimuli with added speech-shaped noise were presented as a closed-set consonant identification task. Initially, consonant-confusion matrices were used to select, from a larger set of consonants and vowel contexts, a set of ten consonants and associated signal-to-noise ratios (SNR) that were sensitive to hearing loss. The sensitivity of the qVCV test to hearing loss was validated by comparing predicted pure-tone average (PTA) hearing thresholds with their audiometric PTA. Clinical viability of the qVCV test was assessed by comparisons to the QuickSIN test. Hearing-aid benefit was assessed by comparing test scores in unaided and aided conditions. Results: The consonants most sensitive to hearing loss were /b d g t k v z s ʃ n/ in the vowel context /[a]/. A cross-validated prediction of PTA had a mean-absolute error of 5.7 dB. The repeatability of qVCV at 50 trials was equivalent to the QuickSIN average of two lists. Hearing-aid benefit was quantified as a decibel reduction in hearing loss. Conclusions: qVCV and QuickSIN performed similarly when test times were equated. The advantages of qVCV include lower cognitive demand, fewer learning effects, and automated scoring. A qVCV-predicted PTA that greatly exceeds the audiometric PTA may indicate either cognitive deficits or cochlear neural degeneration. The qVCV quantification of hearing-aid benefit may have clinical value.
MacLean, J.; Zhou, M.; Bidelman, G.
Entrainment and predictive coding aid speech perception in both quiet and noisy environments. Isochronous, periodic auditory rhythmic cues facilitate entrainment and temporal expectations which can benefit encoding and perception of target speech. However, most studies using isochronous cues confound periodicity with predictability. To this end, we characterized how systematic changes in the acoustic dimensions of stimulus rate, target phase, periodicity, and predictability of an entraining sound precursor impact the subsequent identification of concurrent speech targets. Target concurrent vowel pairs were preceded by rhythmic woodblock cues which were either periodic-predictable (PP, isochronous rhythm), aperiodic-predictable (AP, accelerating rhythm), or aperiodic-unpredictable (AU, random rhythm). The number of pulses per rhythm was roved to further manipulate predictability. Stimuli also varied in presentation rate (2.5, 4.5, 6.5 Hz) and target speech phase (in-phase, 0°; out-of-phase, 90°, 180°) relative to the preceding entraining rhythm. We also measured participants' musical pulse continuation and standardized speech-in-noise perception abilities. We did not observe any effects of stimulus rhythm, rate, or target phase on target speech identification accuracy. However, reaction times were slowest at the nominal speech rate (4.5 Hz) and were most disrupted by out-of-phase presentations following the PP rhythm. Double-vowel task performance was associated with stronger musical pulse continuation abilities, but not speech-in-noise perception. Our results support the notion that entraining rhythmic cues rely on top-down processing but are relatively muted when stimulus predictability is unknown. Additionally, we find that individual differences in musical pulse perception may underlie the benefits of rhythmic cueing on subsequent speech perception.
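To make the rate and phase manipulation concrete, the sketch below converts a target phase into a time offset from the expected beat. It assumes, as an illustration only, that phase is defined relative to one period of an isochronous rhythm at the stated rate, so the offset is (phase / 360) × (1 / rate). The function name `phase_to_delay_ms` is hypothetical, and the abstract does not say how phase was defined for the aperiodic rhythms.

```python
def phase_to_delay_ms(rate_hz, phase_deg):
    """Time offset (ms) from the expected beat for a given phase.

    Assumption: phase is expressed as a fraction of one rhythm period.
    """
    period_ms = 1000.0 / rate_hz
    return period_ms * (phase_deg / 360.0)


rates_hz = [2.5, 4.5, 6.5]   # presentation rates from the abstract
phases_deg = [0, 90, 180]    # target phases from the abstract

for rate in rates_hz:
    delays = ", ".join(
        f"{phase}° -> {phase_to_delay_ms(rate, phase):.1f} ms" for phase in phases_deg
    )
    print(f"{rate} Hz (period {1000.0 / rate:.1f} ms): {delays}")
```

Under this reading, the same nominal phase corresponds to quite different absolute delays across rates, which is one reason rate and phase are manipulated jointly.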
Rotaru, I.; Geirnaert, S.; Heintz, N.; Bertrand, A.; Francart, T.
Selective auditory attention decoding (AAD) enables tracking which of multiple concurrent speakers a listener attends to and is a key building block for neuro-steered hearing devices. While AAD integrated in a closed-loop system with real-time neurofeedback (NFB) is hypothesized to improve decoding through neural adaptation and error-correction behaviour, the short-term behavioral and algorithmic impact of such a bilateral human-machine interaction remains poorly understood. Here we evaluated the effects of NFB on AAD accuracy and user experience in a single-session AAD paradigm with online NFB involving nineteen participants. They performed a selective listening task with enforced attention switches across four conditions: open-loop (OL), closed-loop with auditory gain feedback (CLA), closed-loop with visual feedback (CLV), and a condition with pseudo-auditory gain control (psCLA) decoupled from the participant's individual neural activity. AAD was performed online using both subject-specific and subject-independent linear decoders on 5 s sliding windows, followed by Hidden Markov Model post-processing. Online analysis showed comparable decoding performance across all conditions. However, offline post hoc analysis using subject-independent decoders revealed that AAD accuracy in the CLA condition was significantly lower than in the OL baseline. Subjectively, participants reported that CLA was significantly more distracting and required higher switching effort. Crucially, a causal analysis of the psCLA condition found no robust evidence that higher audio gains inherently improve decoding accuracy. Our results demonstrate that within a single-session paradigm with rapidly varying feedback cues, auditory neurofeedback may degrade AAD performance by increasing cognitive load and distraction. These findings suggest that suboptimal feedback can impede rather than facilitate learning.
We conclude that more accurate and stable decoders and longitudinal, multi-session training protocols are likely essential prerequisites for achieving beneficial neurofeedback effects in closed-loop auditory attention systems.
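The abstract mentions Hidden Markov Model post-processing of per-window decisions but does not describe the model. As a generic illustration, not the authors' method, the sketch below runs a two-state forward filter over binary window decisions. The function name `hmm_smooth` and the values of `p_correct` and `p_stay` are hypothetical choices made only for this example.

```python
def hmm_smooth(decisions, p_correct=0.7, p_stay=0.95):
    """Causal two-state HMM filter over per-window attention decisions.

    decisions: list of 0/1 labels, one per decoding window.
    p_correct: assumed probability that a single window decision is right.
    p_stay:    assumed probability that attention does not switch between windows.
    """
    belief = [0.5, 0.5]  # uniform prior over the two speakers
    smoothed = []
    for d in decisions:
        # Prediction step: allow for a possible attention switch.
        pred = [
            p_stay * belief[0] + (1 - p_stay) * belief[1],
            (1 - p_stay) * belief[0] + p_stay * belief[1],
        ]
        # Update step: weight each state by how well it explains the decision.
        post = [
            pred[s] * (p_correct if d == s else 1 - p_correct) for s in (0, 1)
        ]
        norm = sum(post)
        belief = [x / norm for x in post]
        smoothed.append(0 if belief[0] >= belief[1] else 1)
    return smoothed


raw = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]  # toy decisions, not real data
smoothed = hmm_smooth(raw)
print(raw)
print(smoothed)
```

In this toy sequence the isolated flip at the fourth window is suppressed, while the sustained run of 1s at the end eventually moves the output to the other speaker. The cost is a delay of a few windows after a genuine switch, which illustrates the stability versus switching-speed trade-off that such post-processing involves.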
Jedrzejczak, W.; Kochanek, K.; Skarzynski, H.
Introduction: Auditory brainstem response (ABR) is a standard objective method for estimating hearing threshold, especially in patients who cannot reliably participate in behavioral audiometry. However, ABR interpretation is usually performed by an expert. This study evaluated whether two general-purpose artificial intelligence (AI) multimodal large language model (LLM) chatbots, ChatGPT and Qwen, can accurately estimate ABR hearing thresholds from ABR waveform images. The accuracy was measured by comparisons with the judgements of 3 expert audiologists. Methods: A total of 500 images, each containing several ABR waveforms recorded at different stimulus intensities, were analyzed. Three expert audiologists established the reference auditory thresholds based on visual identification of wave V at the lowest stimulus intensity, with the most frequent judgment among the three used as the reference. Each waveform image was independently submitted to ChatGPT (version 5.1) and Qwen (version 3Max) using the same standardized prompt and without additional clinical context. Agreement with the expert thresholds was assessed as mean errors and correlations. Sensitivity and specificity for detecting hearing loss (>20 dB nHL) were also calculated. In cases where the AI and expert thresholds nominally matched, corresponding latency measures were also compared. Results: Auditory thresholds derived from both LLMs correlated strongly with expert opinion, with Pearson r = 0.954 for ChatGPT and r = 0.958 for Qwen. ChatGPT showed a mean error of +5.5 dB and Qwen showed a mean error of -2.7 dB. Exact nominal agreement with expert values was achieved in 34.6% of ChatGPT estimates and 35.6% of Qwen estimates; agreement within ±10 dB was observed in 75.6% and 80.0% of cases, respectively. For hearing-loss classification, ChatGPT achieved 100% sensitivity but low specificity (20.4%), whereas Qwen showed a more balanced profile with 91.6% sensitivity and 67.5% specificity. 
Curiously, estimates of wave V latency were markedly poor for both LLMs, with systematic underestimation and weak correlations with the expert judgements. Conclusion: ChatGPT and Qwen demonstrated a moderate ability to estimate ABR thresholds from waveform images, although their performance was not good enough for independent clinical use. Both models captured general patterns of hearing loss severity, but there was systematic bias, a poor balance between sensitivity and specificity, and poor latency estimation. General-purpose multimodal LLMs may have potential as assistive or preliminary tools, but clinically reliable ABR interpretation will likely require specialized, domain-trained AI systems with expert oversight.
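The agreement measures named in this abstract (mean error, exact and ±10 dB agreement, and sensitivity and specificity at the >20 dB nHL cutoff) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `threshold_agreement` is hypothetical, the threshold values are invented toy data rather than study data, and the sign convention (AI minus expert) is an assumption.

```python
from statistics import mean


def threshold_agreement(ai_db, expert_db, cutoff_db=20, tolerance_db=10):
    """Agreement metrics between AI and expert thresholds (dB nHL).

    Assumes both classes (hearing loss and no hearing loss) occur in expert_db.
    """
    pairs = list(zip(ai_db, expert_db))
    errors = [a - e for a, e in pairs]  # positive = AI overestimates threshold
    # Hearing loss is a threshold above the cutoff (>20 dB nHL in the abstract).
    tp = sum(a > cutoff_db and e > cutoff_db for a, e in pairs)
    fn = sum(a <= cutoff_db and e > cutoff_db for a, e in pairs)
    tn = sum(a <= cutoff_db and e <= cutoff_db for a, e in pairs)
    fp = sum(a > cutoff_db and e <= cutoff_db for a, e in pairs)
    return {
        "mean_error_db": mean(errors),
        "exact": sum(err == 0 for err in errors) / len(errors),
        "within_tolerance": sum(abs(err) <= tolerance_db for err in errors) / len(errors),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy thresholds for illustration only (not data from the study).
expert = [10, 20, 30, 40, 60, 20, 10, 50]
ai = [20, 20, 30, 50, 50, 30, 10, 20]
metrics = threshold_agreement(ai, expert)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

A mean error near zero can coexist with poor specificity, as the toy data show, which is why the abstract reports the classification measures separately from the mean error and correlation.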
De Vreese, S.; Graïc, J.-M.; Mazzariol, S.; Huggenberger, S.; Fogli, M.; Luzzati, F.; Corona, C.; Favole, A.; Cerda-Domenech, M.; Frigola, J.; Andre, M.
The peripheral auditory system of dolphins comprises specialised bony, fatty, vascular, and neural structures adapted for underwater hearing and diving physiology. These include the external ear canal, acoustic fat bodies, sinuses, and associated neurovascular networks, which together support sound conduction, protection, and possibly sensory functions. Despite advances in gross anatomical description, the detailed integration of these tissues, particularly the innervation, neurovascular organisation, and their functional implications, remains poorly understood. Previous studies have described the presence of sensory nerve formations and vascular plexuses, but their arrangement, connectivity, and relation to each other are unresolved. Here, we combine macroscopic dissection, DICE-µCT, histology, and high-resolution confocal microscopy to characterise several neurovascular and sensory components of the dolphin peripheral auditory system in several delphinid species. Macroscopic dissection and DICE-µCT revealed the traditional acoustic fat body distribution with detailed morphology of the posterolateral extension that is not well-known. The cranial nerve distribution, and specifically the mandibular nerve branching patterns, are described in detail. Confocal microscopy uncovered a stratified neurovascular plexus around the external ear canal with a complex sensory system comprising lamellar corpuscles, Merkel cell-neurite complexes, and intraepithelial nerve fibres. Notably, the lamellar corpuscles formed a continuous, three-dimensional neural network with frequent merging and splitting of axonal bundles, shared perineuria, and vascular integration, features not observed in previous studies.
Our findings demonstrate that the dolphin external ear canal and surrounding structures form a sophisticated, multimodal somatosensory organ, integrating structural, vascular, and neural specialisations likely adapted for proprioceptive mechanosensation in the aquatic environment. This study provides insights into the integration of the various components of the peripheral hearing apparatus. Future studies integrating anatomical, electrophysiological, and biomechanical approaches are needed to fully elucidate these adaptations.
Manasevich, V.; Kostanian, D.; Rogachev, A.; Sysoeva, O.
Rise time (RT) is considered to be one of the most significant acoustical characteristics of auditory speech stimuli. A substantial amount of data has been accumulated on the neurophysiological mechanisms of RT processing under different conditions and in different groups of people, but these data have not been systematised. This review focuses on studies that have investigated electroencephalographic (EEG) markers of RT sensitivity. The literature search was conducted according to the PRISMA statement in the PubMed, Web of Science and APA PsycInfo databases. The resulting review comprised 37 studies that considered diverse aspects of RT processing. The review describes the main stimulation parameters affecting electrophysiological markers of RT processing, as reflected in different components of event-related potentials, brainstem responses and cortical rhythmic activity. The main finding of this review is that rise time prolongation leads to a decrease in the amplitude of the main ERP components and an increase in their latencies. However, the sensitivity of the EEG markers varied: the earliest components tracked subtle differences (a few tens of microseconds), while the later components coded larger ones (up to 500 ms). Moreover, the observed effects may vary depending on aspects of the experimental paradigm, the age of participants and speech-related problems. Future research may benefit from addressing understudied clinical groups and ERP components such as P1 and N2, which dominate in children.
Hunter, L. L.; Feeney, M. P.; Fitzpatrick, D.; Keefe, D. H.
Objectives: The overall goal of this study was to assess tympanometric and ambient wideband acoustic immittance (WAI) tests and wideband acoustic reflex thresholds (ART) in well-baby and newborn intensive care (NICU) cohorts with three specific objectives: 1) Assess predictive accuracy for WBT and ART for conductive dysfunction in ears referring on the first or second stages of newborn hearing screening; 2) Identify inadequate tests likely due to probe blockages or leaks; and 3) Assess prediction models separately for well-baby and NICU screening outcomes. Design: Prospective, observational study of full-term (n=514) and premature newborns (n=239) recruited from a well-baby and NICU nursery birth hospital newborn hearing screening program. Wideband tympanometry, ambient absorbance, and acoustic reflexes were tested after Stage 1 transient otoacoustic emissions (TEOAE) screening. The reference standard for Pass or Refer groups was initially defined on the Stage 1 TEOAE test result. Pass or Refer groups were then reassigned based on the Stage 2 screening ABR for those who referred at Stage 1, and for all NICU infants. Multivariate models were developed using reflectance and admittance variables to predict conductive dysfunction relative to the screening reference standard in a randomized sub-group of subjects at Stage 1 and Stage 2 screening. Classification accuracy was evaluated on a second, independent sub-group. Individual tests were classified as having inadequate probe fits if they had excessively low values of sound pressure level or susceptance (leak) or absorbance (blockage). Results: Comparisons of ambient absorbance between Pass and Refer screening groups revealed the greatest differences and effect sizes in frequency bins between 1.4 and 2 kHz. Screening failure at both Stage 1 and 2 was most accurately predicted by models using ambient absorbance and power level variables at frequencies between 1 and 2.8 kHz, including ARTs.
Tympanometric admittance variables at the positive-pressure tail for frequencies between 1 and 2.8 kHz, in combination with the ART, were more accurate predictors than those at peak pressure or the negative-pressure tail. Multivariate models generalized well to an independent group of infants at both Stage 1 and 2 for both the ambient and tympanometric models. Ambient tests revealed more inadequate tests than tympanometric tests, primarily due to blocked probe tips. Exclusion of ears with detected probe leaks or blockages slightly improved the ambient prediction models, but did not affect tympanometric models. Conclusion: Wideband acoustic reflex tests improved all models for ambient and tympanometric absorbance. Multivariate prediction models developed for WAI tests were repeatable in an independent group of well and NICU infants, suggesting that the results are generalizable to these populations. Detection of probe blockage or leaks slightly improved prediction for ambient measures. Because pressurized tests require a hermetic seal, they have the advantage of ensuring probe seals and are thus useful for confirming adequate probe insertion.
Garcia Ruiz, T.; Sanes, D. H.
Many perceptual skills improve with a few days of training. However, weeks or months of practice may be required to reach a level of expertise on complex tasks (Watson, 1980). Here, we explored how gerbils attain expertise on a difficult task: amplitude modulation (AM) rate discrimination at very shallow AM depths, similar to the depths used during vocal communication. Using an appetitive Go-Nogo procedure, we first trained 6 gerbils to perform an AM discrimination task (Nogo: 4 Hz; Go: 4.25-10 Hz) at a depth of 0 dB (re: 100% depth). Animals were then trained to perform AM discrimination at successively shallower depths, from -3 to -18 dB, requiring an average of 5-10 days of practice to reach a performance metric of d′ ≥ 1 for each depth. Finally, we determined that AM discrimination thresholds were nearly identical between 0 and -12 dB, and only slightly elevated at -15 dB. Improvements in performance were accompanied by a large reduction in response time during procedural learning, and a gradual reduction of response time during perceptual learning, even as AM depth became shallower (i.e., more difficult). The shallowest depth at which gerbils displayed peak performance on the AM discrimination task is similar to their lowest AM depth detection thresholds. These results suggest that performance on challenging auditory perceptual tasks requires prolonged practice and is accompanied by increased automaticity (i.e., lower response time) that stabilizes once expertise is achieved.
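For readers unfamiliar with the two quantities in the abstract above, the following is a minimal sketch of how they are commonly computed. It assumes the usual conventions, which the abstract does not spell out: AM depth in dB re 100% is taken as 20·log10(m), where m is the modulation index, and d′ is taken as z(hit rate) minus z(false-alarm rate). The function names are hypothetical and the rates in the usage note are illustrative, not data from the study.

```python
from statistics import NormalDist


def depth_db_to_fraction(depth_db: float) -> float:
    """Convert AM depth in dB (re: 100% depth) to a modulation index m.

    Assumes the common convention depth_db = 20 * log10(m).
    """
    return 10 ** (depth_db / 20)


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index for a Go-Nogo task: z(hit) - z(false alarm).

    Rates must lie strictly between 0 and 1; the inverse normal CDF
    is undefined at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Depths named in the abstract, expressed as modulation indices.
    for depth in (0, -3, -12, -15, -18):
        print(f"{depth:>4} dB -> m = {depth_db_to_fraction(depth):.3f}")

    # Illustrative rates only: does this session meet a d' >= 1 criterion?
    print(d_prime(0.80, 0.30) >= 1.0)
```

Under these assumptions, a depth of -18 dB corresponds to a modulation index of roughly 0.13, i.e. about 13% modulation, which gives a sense of how shallow the hardest condition is.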